Circulating metabolites associated with insulin sensitivity may represent
useful biomarkers, but their causal role in insulin sensitivity and
diabetes is less certain. We previously identified novel
metabolites correlated with insulin sensitivity measured by the 
hyperinsulinemic-euglycemic clamp. The top-ranking metabo- 
lites were in the glutathione and glycine biosynthesis pathways. 
We aimed to identify common genetic variants associated with 
metabolites in these pathways and test their role in insulin sen- 
sitivity and type 2 diabetes. With 1,004 nondiabetic individuals 
from the RISC study, we performed a genome-wide association 
study (GWAS) of 14 insulin sensitivity-related metabolites and 
one metabolite ratio. We replicated our results in the Botnia 
study (n = 342). We assessed the association of these variants 
with diabetes-related traits in GWAS meta-analyses (GENESIS [in- 
cluding RISC, EUGENE2, and Stanford], MAGIC, and DIAGRAM). 
We identified four associations with three metabolites: glycine
(rs715 at CPS1), serine (rs478093 at PHGDH), and betaine
(rs499368 at SLC6A12; rs17823642 at BHMT). We also identified one
association signal with glycine-to-serine ratio (rs1107366 at ALDH1L1).
There was no robust evidence for association between these 
variants and insulin resistance or diabetes. Genetic variants as- 
sociated with genes in the glycine biosynthesis pathways do 
not provide consistent evidence for a role of glycine in diabetes- 
related traits. Diabetes 62:2141-2150, 2013 
Using mass spectrometry-based metabolomic
approaches, recent studies have identified asso- 
ciations between small molecules and insulin 
sensitivity and type 2 diabetes (1-6). Previous 
studies in the RISC (Relationship between Insulin Sensitivity 
and Cardiovascular disease risk) study identified novel 
associations between insulin sensitivity and small molecules 
including amino acids glycine, cysteine, isoleucine, and 
creatine and the organic acids α-hydroxybutyrate (α-HB)
and α-ketobutyrate (α-KB). Glycine was the amino acid
most strongly associated with increased insulin sensitivity 
(4) — a finding consistent with other studies (7-9). 

While some metabolites may represent important bio- 
markers, the causal directions of their associations with 
diabetes-related traits are uncertain. It is important to 
understand the causal role, or otherwise, of these mole- 
cules in order to avoid an increasingly confusing picture of 
which biomarkers are causal and which are secondary to 
the diabetes disease process. 

The identification of genetic variants strongly associated 
with metabolites may provide useful tools to help understand 
causal directions of correlated phenotypes. Genetic variants 
are unlikely to be influenced by disease processes or envi- 
ronmental factors and therefore provide robust tools in 
Mendelian randomization to assess causal directions of 
correlated phenotypes (10). Recently, the principle of Men- 
delian randomization has been used to provide evidence for 
a causal association between reduced B-type natriuretic 
peptide levels and type 2 diabetes (11) and reduced sex 
hormone-binding globulin levels and type 2 diabetes (12), 
but the approach provided no evidence for a causal re- 
lationship between raised triglycerides and increased insulin 
resistance (13). 

In this study, we focused on the associations of glycine 
and glutathione biosynthesis pathways with type 2 dia- 
betes because, apart from the strong correlations identi- 
fied in the RISC study, some other recent studies have 
provided evidence that high glycine level is associated 
with increased insulin sensitivity and decreased type 2 
diabetes risks (7-9). In addition, type 2 diabetic patients 
have unrestrained gluconeogenesis and severely deficient 
glutathione synthesis (14,15). Glycine supplementation can 
improve deficient glutathione synthesis in type 2 diabetic 
patients, and glutathione supplementation can improve 
insulin sensitivity in nondiabetic individuals (15,16). We 
hypothesized that glycine and glutathione pathways con- 
tribute to diabetes and insulin resistance. We aimed to 
identify genetic variants influencing circulating levels of
metabolites in the glycine and glutathione pathways. We 
tested these variants in Mendelian randomization analyses 
to examine the potential causal role of these metabolites in 
insulin resistance and type 2 diabetes. 

RESEARCH DESIGN AND METHODS 

We analyzed nondiabetic participants of European ancestry from four studies 
who provided DNA for genome-wide genotyping and underwent a direct 
measure of insulin sensitivity. The four studies include RISC (n = 957), Botnia 
(n = 341), EUGENE2 consortium (European network on Functional Genomics 
of type 2 diabetes; n = 577) and the Stanford Insulin Suppression Test (IST)
Cohort (n = 263). The descriptive characteristics of the RISC participants are 
shown in Table 1. In brief, we excluded individuals with cryptic relatedness 
using PLINK pairwise identity by descent estimation (PI_HAT >0.2). We excluded
individuals with lipid disorders or diabetes, lipid medications, pregnancy,
fasting plasma glucose ≥7.0 mmol/L, or 2-h plasma glucose (on a 75-g
oral glucose tolerance test) ≥11.0 mmol/L. The individual study characteristics,
genotyping, and phenotyping details are provided in Supplementary
Data. We performed genome-wide association studies (GWAS) for metabolites 
in the RISC study, replicated the GWAS findings in the Botnia study, and 
carried out Mendelian randomization analyses in RISC, EUGENE2, and 
Stanford IST to test the associations of genetic variants with insulin sensitivity.
Selection and measurement of metabolites in RISC and Botnia studies. 
We selected 14 metabolites for GWAS. The metabolites were selected based on 
the study of Gall et al. (2010) (4). We selected metabolites that were both 
available in the RISC study and associated with insulin sensitivity (3,5). Details 
are shown in Table 2. 

We selected metabolite ratios from the 14 metabolites based on two criteria: 
1) the two metabolites were linked by one-step enzymatic reactions, and 2) the
ratio was associated with insulin sensitivity measured by hyperinsulinemic- 
euglycemic clamp (M value). The glycine-to-serine ratio was the only one that 
satisfied both criteria (Fig. 2D). In both RISC and Botnia studies, metabolites were 
measured using multiple-platform mass spectrometry technology (ultra-high 
performance liquid chromatography and gas chromatography) as previously de- 
scribed (17-19). Absolute quantitation was performed for the 14 metabolites 
(Table 2) for the RISC study samples by UHPLC-MS/MS analysis (4,49). 
GWAS of metabolites and metabolite ratios in RISC. The plasma 
concentrations of metabolites were fitted in a linear regression model with ad- 
justment for age, sex, and centers. Then, the standardized residuals were nor- 
malized by inverse-normal transformation prior to GWAS. We performed GWAS 
with each metabolite using MACH2QTL based on an additive genetic model (20,21).
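The inverse-normal transformation step described above can be sketched as follows. This is a minimal illustration only: the covariate adjustment (age, sex, center) is omitted, the input residuals are hypothetical, and the rank offset of 0.5 is an assumed convention that the text does not specify.

```python
from statistics import NormalDist


def inverse_normal_transform(values, offset=0.5):
    """Map values to standard normal quantiles by rank (ties share the average rank)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over a run of tied values.
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank for the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    std_normal = NormalDist()
    # (rank - offset) / n keeps every probability strictly between 0 and 1.
    return [std_normal.inv_cdf((r - offset) / n) for r in ranks]


# Hypothetical standardized residuals from the covariate-adjusted regression.
residuals = [0.3, -1.2, 2.5, 0.3, 0.9]
transformed = inverse_normal_transform(residuals)
```

The transformed values keep the rank order of the residuals but follow an approximately normal distribution, which limits the influence of outlying metabolite concentrations on the association test.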

For the glycine-to-serine ratio, we log10-transformed the ratio and then
adjusted for age, sex, and center in linear regression analyses. We performed 
GWAS using MACH2QTL as with single metabolites' concentrations. 
Candidate-region association study of metabolites in RISC. Some of the 
key enzymes and transporters involved in the metabolism and transport of 
metabolites are known. We selected 34 genes for the fourteen metabolites, 
consisting of carrier-encoding or enzyme-encoding genes involved in the rate- 
limiting steps of the relevant biosynthetic pathways. The genes selected are 
listed in Table 2. We classified single nucleotide polymorphisms (SNPs) within 
300 kb of these genes as candidate SNPs. To prioritize SNPs for follow-up, we 
corrected for multiple testing of the total number of SNPs in each can- 
didate region (a conservative threshold, given the correlation between SNPs). 



However, we still used P value <3 × 10^-9 (5 × 10^-8 corrected for 15 tests
[14 metabolites and one ratio]) in all available studies as the final criteria for 
association. 
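The two thresholds used here are simple Bonferroni-style divisions, and the arithmetic can be checked with a short sketch. The function names and the region size of 500 SNPs are hypothetical and chosen only for illustration.

```python
GENOME_WIDE_ALPHA = 5e-8  # conventional genome-wide threshold for common SNPs
N_TRAITS = 15             # 14 metabolites plus one metabolite ratio


def final_threshold(alpha=GENOME_WIDE_ALPHA, n_traits=N_TRAITS):
    """Genome-wide threshold corrected for the number of traits tested."""
    return alpha / n_traits


def candidate_region_threshold(n_snps_in_region, alpha=0.05):
    """Locus-wide threshold: 0.05 divided by the number of SNPs in the region."""
    return alpha / n_snps_in_region


print(f"{final_threshold():.2e}")                # 3.33e-09
print(f"{candidate_region_threshold(500):.1e}")  # hypothetical region of 500 SNPs
```

The first value rounds to the 3 × 10^-9 criterion applied to the combined studies.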

Selection of SNPs for genotyping in Botnia and meta-analysis in RISC 
Single metabolites. We used two statistical thresholds to select SNPs for 
replication. First, we used P value <5 × 10^-8 as the standard for genome-wide
significance in the context of common SNPs. Second, for the 34 candidate genes 
(Table 2 and RESEARCH DESIGN AND METHODS), we divided 0.05 by the total number
of SNPs in the gene ±300 kb. We meta-analyzed SNP-metabolite results from 
RISC and Botnia using an inverse variance-weighted approach as implemented 
in STATA command "metan." For Mendelian randomization analyses, we used 
SNPs reaching P value <3 × 10^-9 in the meta-analysis of the two studies.
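As an illustration of the inverse variance-weighted approach, the following sketch combines per-study effect estimates under a fixed-effect model. It is not the STATA "metan" implementation, and the two input estimates are hypothetical.

```python
import math


def inverse_variance_meta(betas, ses):
    """Fixed-effect meta-analysis: weight each estimate by 1 / SE^2."""
    weights = [1.0 / se ** 2 for se in ses]
    total_weight = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    se = math.sqrt(1.0 / total_weight)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal P value
    return beta, se, z, p


# Hypothetical SNP-metabolite estimates (SD units) from two studies.
beta, se, z, p = inverse_variance_meta([0.30, 0.20], [0.05, 0.10])
```

The more precise study dominates the pooled estimate, which is why a small replication cohort shifts the combined effect only modestly.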
Metabolite ratio. To validate SNPs associated with the glycine-to-serine ratio, 
we linked results from the recently published Cooperative Health Research in 
the Region of Augsburg (KORA) and UKtwins studies (22) to our GWAS results. 
We meta-analyzed our results with those from the KORA or UKtwins studies 
with a significance threshold of P value <3 × 10^-9 when including all available
studies. 

Effects of associated SNPs on other metabolites in the glycine and 
glutathione biosynthesis pathways. We performed further analyses for the 
five SNPs associated with metabolites in the glycine or glutathione biosynthesis 
pathways, which include glycine, serine, betaine, α-HB, α-KB, and glycine-to-
serine ratio. We tested the associations of each SNP against the other me- 
tabolite traits. We performed association analyses in the linear regression 
model described in method section 3 in STATA (version 10.1). 
Mendelian randomization analyses 

Association of metabolite-associated SNPs in glycine biosynthesis 
pathway with insulin sensitivity. We tested the role of metabolite-associated 
SNPs reaching genome-wide significance with two diabetes-related traits: 
hyperinsulinemic-euglycemic clamp (M value corrected for kilograms body 
weight), which was a measure of whole-body insulin sensitivity, and fast- 
ing insulin. M value-based measures of insulin sensitivity were corrected 
for age, sex, and center; converted to SD units; and inverse normalized.
Fasting insulin was natural log transformed; corrected for age, sex, and 
center; and converted to SD units. Using RISC data, we calculated two 
estimates of the association between metabolite SNPs and diabetes-related 
traits for each metabolite trait. First, we calculated an estimated expected 
effect if there was a causal association between metabolites and diabetes- 
related measures, using a triangulation approach as shown in Supplementary 
Fig. 1: we calculated the correlation between standardized metabolite levels 
and the two diabetes-related traits. We then multiplied these standardized 
effects by that between metabolite SNPs and metabolites to estimate an 
approximate expected effect size of the association between metabolite 
SNPs and M value and fasting insulin. We calculated approximate expected 
95% CIs based on the observed effects and SEs using the Taylor series 
expansion of the ratio of two means (23). Second, we tested the observed 
effect between metabolite SNPs and the two diabetes-related traits. For the 
clamp-based measures of insulin sensitivity, we used three studies (RISC, 
EUGENE2, and Stanford IST) and meta-analyzed results using the program
METAL (24). In EUGENE2, insulin sensitivity was measured using the same 
hyperinsulinemic-euglycemic clamp-based protocol as that used by RISC 
(25). In the Stanford study, insulin sensitivity was measured by steady-state 
plasma glucose method. The steady-state plasma glucose value is highly inversely
correlated with M value (r = -0.93, P < 0.001) (26), so meta-analyses
were performed between the three studies by reversing the signs of the effect 
sizes in Stanford. 
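The triangulation step can be sketched as the product of two standardized effects. The example below is illustrative only: the input values are hypothetical, and the confidence interval uses a first-order Taylor (delta-method) approximation for a product of two independent estimates, which is an assumption and may differ from the exact formula in reference 23.

```python
import math


def expected_snp_trait_effect(beta_snp_met, se_snp_met, r_met_trait, se_r):
    """Expected SNP-trait effect if the metabolite-trait correlation were causal."""
    expected = beta_snp_met * r_met_trait
    # Delta-method variance of a product of two independent estimates.
    se = math.sqrt((r_met_trait * se_snp_met) ** 2 + (beta_snp_met * se_r) ** 2)
    ci = (expected - 1.96 * se, expected + 1.96 * se)
    return expected, ci


# Hypothetical inputs: SNP-metabolite effect of 0.20 SD per allele (SE 0.04)
# and metabolite-trait correlation of 0.15 (SE 0.03).
expected, ci = expected_snp_trait_effect(0.20, 0.04, 0.15, 0.03)
```

The observed SNP-trait effect from the meta-analysis is then compared with this expected value and its interval.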



TABLE 1
Summary details of RISC individuals and relevant characteristics

                          Age                       BMI                         FI        M value
Units                     years                     kg/m^2                      pmol/L    µmol/kg body wt/min*
N                         1,004                     1,004                       973       1,004
Mean                      43.91                     25.42                       34.3      39.84
SD of mean                8.37                      4.04                        18.55     16.2
Median                    44                        24.9                        30        38.33
Minimum                   30                        16.9                        3         4.92
Maximum                   61                        43.9                        116       114.25
Correlation with age      1
Correlation with BMI      r = 0.30; P = 8.3 × 10^-6
Correlation with FI       r = 0.01; P = 0.31        r = 0.11; P = 1.2 × 10^-68
Correlation with M value  r = -0.04; P = 0.01

FI, fasting insulin. *M value for the clamp expressed per kilogram body weight.
TABLE 2
Fourteen metabolites studied in GWAS and candidate genes in RISC (n = 1,004)

                                                       Mean (µg/mL)                                      Correlation with M value
Metabolites          Candidate genes                   (minimum-maximum)      SD           Median        r       P
α-HB                 LDHA, LDHB, LDHC, LDHD, α-HBDH    [illegible]            [illegible]  [illegible]   -0.35   1.5 × 10^-30
Adrenate                                               [illegible]            [illegible]  [illegible]   -0.19   9.0 × 10^-10
α-Ketoglutaric acid                                    [illegible]            [illegible]  [illegible]   -0.21   1.4 × 10^-11
[illegible]                                            [illegible]            [illegible]  [illegible]   -0.28   4.4 × 10^-20
Betaine              BHMT, CHDH                        [illegible]            [illegible]  [illegible]   0.06    6.6 × 10^-11
Decanoylcarnitine    CPT1C, SLC25A20                   0.04 (0-0.35)          0.04         0.03          0.14    1.3 × 10^-5
Glutamate            NAGS, SIRT4, GLUD1                18.19 (4.99-100.26)    12.21        14.38         0.06    0.06
Glycine              SHMT1/2, GLDC, GCSH               17.35 (7.05-41.53)     5.26         16.12         0.24    7.2 × 10^-15
Ketovaline           BCAT2, BCKDHA, BCKDHB             1.59 (0.12-2.99)       0.40         1.62          -0.23   7.1 × 10^-14
Linoleoyl-GPC        PLA2G5, PLA2G12A, PLA2G2D         15.65 (5.24-39.89)     5.18         15.15         0.29    2.4 × 10^-21
Oleate               OLAH, ACSL1                       85.14 (11.91-569.89)   36.67        81.57         -0.17   3.2 × 10^-8
Oleoyl-GPC           OLAH, LCLAT, PLD1                 9.81 (3.14-22.64)      2.93         9.49          0.27    1.2 × 10^-18
Serine               PSPH, PHGDH, CBS, SDS, SHMT2      [illegible]            [illegible]  [illegible]   0.14    [illegible] × 10^-6



For fasting insulin measures of insulin sensitivity, we used data from the 
Meta-Analyses of Glucose and Insulin-related traits Consortium (MAGIC), 
consisting of a meta-analysis of 23 GWAS with 27,589 individuals (27). 
Association of metabolite-associated SNPs with type 2 diabetes. We 
used data from the Diabetes Genetics Replication and Meta-analysis Consor- 
tium (DIAGRAM) to assess the association of metabolite SNPs with type 2 
diabetes. These data come from 8,130 patients with type 2 diabetes and 38,987 
control subjects from eight GWAS (28). 

Instrumental variable analysis. In this RISC study, where we had measures 
of SNPs, metabolites, and insulin sensitivity, we performed instrumental var- 
iable analyses using the two-stage least squares regression approach imple- 
mented in the STATA command "ivreg2." Using the two-stage least squares 
regression approach, the instrumental variable estimator (β_IV) provides an
estimate of the causal effects of exposure (i.e., metabolites) on outcome (i.e.,
insulin sensitivity) even in the presence of unmeasured confounders (10,29). 
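The logic of the instrumental variable estimate can be illustrated with simulated data. The sketch below is not the "ivreg2" implementation: it relies on the fact that, with a single instrument and no additional covariates, the two-stage least squares estimate reduces to the ratio cov(G, Y) / cov(G, X). All parameter values (allele frequency, effect sizes, sample size) are hypothetical.

```python
import random


def covariance(a, b):
    """Sample covariance of two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)


def iv_estimate(g, x, y):
    """Single-instrument 2SLS estimate: cov(G, Y) / cov(G, X)."""
    return covariance(g, y) / covariance(g, x)


def ols_slope(x, y):
    """Ordinary least squares slope of Y on X, for comparison."""
    return covariance(x, y) / covariance(x, x)


rng = random.Random(0)
TRUE_EFFECT = 0.3  # hypothetical causal effect of metabolite on outcome
g, x, y = [], [], []
for _ in range(20000):
    gi = (rng.random() < 0.3) + (rng.random() < 0.3)  # allele count 0, 1, or 2
    u = rng.gauss(0, 1)                                # unmeasured confounder
    xi = 1.0 * gi + u + rng.gauss(0, 1)                # exposure (metabolite)
    yi = TRUE_EFFECT * xi - u + rng.gauss(0, 1)        # outcome (insulin sensitivity)
    g.append(gi)
    x.append(xi)
    y.append(yi)

iv = iv_estimate(g, x, y)
ols = ols_slope(x, y)
```

In this simulation the confounder pushes the ordinary regression slope away from the true effect, whereas the genotype-based estimate stays close to it, because the genotype is generated independently of the confounder.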



RESULTS 

GWAS and replication of insulin sensitivity-related 
metabolites. In the RISC study, we identified eight sig- 
nals of interest either at genome-wide significance or 
reaching a locus-wide nominal level of significance around 
one of the candidate genes (Supplementary Table 2). One 
of these signals represented a widely reported association 
between SNPs in the FADS gene cluster and fatty acids
(27,30,31) (in our case adrenate), and we did not pursue 
this association further. We successfully genotyped SNPs 
representing six of the remaining seven signals in the 
Botnia study (Table 3). 

After meta-analysis of RISC and Botnia data (where 
available), we identified four association signals with three 
separate single metabolites (the amino acids glycine,
serine, and betaine) at P value <3 × 10^-9. For ratios of
metabolites, we identified one signal for glycine-to-serine 
ratio that when meta-analyzed with published KORA 
data reached P value <3 × 10^-9. Details of the associations
are given in Table 3 and Fig. 1. Two SNPs associated with 
serine were taken forward from the RISC GWAS but did 
not replicate in the Botnia study (Table 3). 
Novel associations between SNPs in two loci and 
betaine levels. We identified an association between 
rs499368 in the SLC6A12 gene and betaine levels (P value 
1.46 × 10^-10) (Fig. 1A). This signal has not previously been
reported with any other trait and was not captured at r^2 >
0.8 in the published KORA or UKtwins data. The second
association occurred between rs17823642 near a candidate
gene, BHMT, and betaine levels (P value 2.3 × 10^-9) (Fig.
1B). This signal has not previously been reported with any
other trait at genome-wide significance but was captured
at r^2 = 1.0 by rs7732845 in the published KORA data (but
not UKtwins), and a meta-analysis of RISC, Botnia, and
KORA data (P value for KORA alone 1.98 × 10^-6) confirms
very robust evidence of association (meta-analyzed
P value 6.07 × 10^-14).

SNP in a known locus is associated with glycine 
levels. We identified an association between rs715 in the 
3' untranslated region of the CPS1 gene and glycine levels 
at genome-wide significance (P value 3.30 × 10^-50) (Fig.
1C). This signal was not captured at r^2 >0.8 in the published
KORA or UKtwins data, but a SNP (rs4673558) with
an r^2 = 0.21 with rs715 is associated with glycine levels
with P value 4.3 × 10^-11 in the UKtwins data.

SNP in a known locus is associated with serine 
levels. We identified an association between rs478093 near 
the PHGDH gene and serine levels at genome-wide significance
(P value 1.52 × 10^-9) (SNP not available in Botnia
study) (Fig. 1D). This signal was previously reported as
associated with serine and ratios of metabolites involving
serine in the published KORA and UKtwins data (based
on rs477992 [r^2 = 0.93], meta-analyzed P value 1.94 ×
10^-14) (22).

SNP in a novel locus is associated with glycine-to- 
serine ratios. We identified a previously unreported association
between rs1107366 near the ALDH1L1 gene and
glycine-to-serine ratios (P value 2.25 × 10^-6) (Fig. 1E).
This signal reached genome-wide significance in combination
with data from the KORA and UKtwins studies
(meta-analyzed P value 2.8 × 10^-12) (Table 3).
Association between rs715 in CPS1 and glycine 
levels is highly sex specific. The SNPs in the CPS1 locus
have previously been reported to have sex-specific effects
on glycine and homocysteine levels (rs7422339, r^2 = 0.92
with rs715) (32,33). We observed a similar sex-specific
association between this signal and glycine levels (Supplementary
Fig. 2). The association was weaker in males
(β = -0.19 [95% CI -0.31 to -0.08]; P value 1.1 × 10^-3),
and the effect size was more than four times larger in females
(β = -0.84 [-0.98 to -0.69]; P value 4.5 × 10^-28). A Z test
rejected the null hypothesis of no sex-specific effect
(P value 2.67 × 10^-13). This result was consistent
with the female-specific association previously reported
(32,33). There was no evidence that the association was
different between pre- and postmenopausal women (premenopausal,
effect -0.78 [-0.95 to -0.61]; P value 1.57 ×
10^-19; postmenopausal, -1.04 [-1.35 to -0.73]; P value
4.57 × 10^-11). We did not observe any evidence of sex-specific
effects for the other metabolite SNPs.
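The Z test for a difference between the male and female effects can be sketched as below. This is an approximation: the standard errors are recovered from the rounded 95% CIs reported above, so the result will not exactly reproduce the published P value, and the helper names are hypothetical.

```python
import math


def se_from_ci(lower, upper, z_crit=1.96):
    """Recover an approximate standard error from a 95% confidence interval."""
    return (upper - lower) / (2 * z_crit)


def heterogeneity_z(beta_a, se_a, beta_b, se_b):
    """Two-sided Z test for a difference between two independent estimates."""
    z = (beta_a - beta_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


se_male = se_from_ci(-0.31, -0.08)
se_female = se_from_ci(-0.98, -0.69)
z, p = heterogeneity_z(-0.19, se_male, -0.84, se_female)
```

Even with rounded inputs, the difference between the two effects is many standard errors from zero, in line with the strong evidence for a sex-specific effect.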
Effects of metabolite-associated SNPs on other metab- 
olites in the glycine and glutathione biosynthesis path- 
ways. The effects of the five confirmed signals on the other 
four single metabolites levels and glycine-to-serine ratio are 
shown in Table 4, with the last column showing the effects 
of metabolite-insulin sensitivity associations. 
Associations of metabolite-associated SNPs with fast- 
ing insulin-based and clamp-based measures of insulin 
sensitivity. Associations between the five metabolite- 
associated SNPs, metabolite levels, and diabetes-related 
traits (fasting insulin and hyperinsulinemic-euglycemic 
clamp [M value]) are shown in Table 5. There were strong 
correlations between fasting insulin and metabolite levels 
(0.10 < r < 0.20), as was expected, given that the me- 
tabolites were selected as those correlated with a measure 
of insulin sensitivity. These strong correlations between 
phenotypes meant that the MAGIC data had >90% power to 
detect associations at P value < 0.01 based on the estimated 
expected effects between metabolite SNPs and fasting in- 
sulin. However, there were no associations between me- 
tabolite SNPs and fasting insulin. The effect sizes observed 
in the MAGIC data were all smaller than those expected 
based on the triangulation calculations. 

We identified a nominal association between the 
glycine-to-serine ratio-associated SNP rs1107366 near
ALDH1L1 and clamp-based measures of insulin sensitivity
(β = 0.09 SD [95% CI 0.03-0.15]; P value 0.005, where the allele
that raises glycine-to-serine ratios increases clamp-based
insulin sensitivity) (Table 5 and Supplementary
Table 3). The observed effect size on insulin
sensitivity (IS) was larger than the expected effect
(expected β_SNP-IS = 0.03 SD [95% CI 0.01-0.04]). Results of
instrumental variable analyses in RISC were consistent
with the main results: the glycine-to-serine ratio predicted
from the rs1107366 genotypes is associated with clamp-based
insulin sensitivity (β_IV = 1.00 [0.24-1.76]; P value
0.01). No other metabolite SNPs were associated with
clamp-based measures of insulin sensitivity in either the 
triangulation analyses or the instrumental variable anal- 
yses, including the other glycine and serine signals (rs715 
and rs478093). 

Association of metabolite SNPs with type 2 diabetes. 

There was no evidence of association between four of 
the five metabolite SNPs and type 2 diabetes, based on 
the meta-analysis of case-control studies reported by the 
DIAGRAM study (Supplementary Table 4). The rs715 SNP 
in CPS1 associated with glycine levels was poorly captured 
in the type 2 diabetes GWAS meta-analysis (28). 



DISCUSSION 

Using a genome-wide approach, we have identified five 
associations between genetic variants and circulating levels 
of three metabolites and one metabolite ratio. These metab- 
olites occur in pathways strongly correlated with the gold 
standard measure of insulin sensitivity (hyperinsulinemic- 
euglycemic clamp) in the RISC study — primarily, the 
glycine biosynthesis pathway. Three of these associations 
FIG. 1. Regional association plots of the five SNP-metabolite associations in the RISC cohort. In each plot, the top panel shows the name and
location of genes in the UCSC Genome Browser. The -log10 of P values of the imputed SNPs are plotted on the y-axis against genomic position
(NCBI Build 36) on the x-axis. The top signal is represented by a diamond. Estimated recombination rates (taken from HapMap) are plotted to
reflect the local linkage disequilibrium structure around the associated SNPs and their correlated proxies (according to a gray scale from r^2 = 0 to
1, based on pairwise r^2 values from HapMap Phase II CEU). chr, chromosome. (A high-quality color representation of this figure is available in the online
issue.) 



have been previously identified at genome-wide levels of 
significance. We have tested the association of these var- 
iants with insulin sensitivity using the largest collective set 
of studies with these measures, including RISC, EUGENE2, 
and Stanford. 

Amino acid-associated SNPs and diabetes-related 
traits. Glycine and serine are glycogenic amino acids in- 
volved in hepatic gluconeogenesis and glutathione bio- 
synthesis, which are potentially important pathways in 
diabetes and insulin resistance. In the RISC study, glycine, 
serine, and betaine were positively correlated with clamp- 
based measures of insulin sensitivity and negatively cor- 
related with fasting insulin. These associations are in line 
with previous findings (7-9,34). 

Using Mendelian randomization analyses, we assessed 
whether these amino acids play a causal role in insulin 
sensitivity and type 2 diabetes risks. Our Mendelian ran- 
domization analyses do not support a causal association 
between genetically changed glycine, serine, and betaine 
levels and insulin sensitivity levels as measured by fasting 
insulin or type 2 diabetes. However, for the clamp-based 
measures of insulin sensitivity we observed a suggestive 
association of a glycine-to-serine ratio-associated SNP, 
rs1107366 (near the ALDH1L1 gene). The allele that raises
glycine-to-serine ratios increases clamp-based insulin
sensitivity. The rs1107366-insulin sensitivity association
needs further replication in a larger sample size, especially
given that other glycine or serine-associated SNPs (e.g., 
rs715 and rs478093) are not associated with insulin sensi- 
tivity. The rs715 variant in CPS1, for example, explains 
a greater proportion of the variance in glycine levels 
(~13% compared with ~2% for rs1107366). It is also possible
that rs1107366 could influence insulin sensitivity via
non-glycine-mediated (pleiotropic) effects. 

In an insulin-resistant state, the increase of hepatic 
gluconeogenesis would result in greater consumption of 
glycogenic amino acids, which may be accentuated in 
individuals with genetically influenced lower levels of 
these molecules and is consistent with the hypothesis of 
reverse causality (35). However, causal mechanisms in 
both directions remain plausible because gluconeogenesis 
is controlled in many different ways. 
Biology of metabolite levels. Our data highlight some 
candidate genes and protein products important in con- 
trolling circulating metabolite levels. At each locus, there 
is a clear candidate gene, although we cannot be certain 
which gene is affected by the associated SNP. 
Betaine and serine signals are in or near function- 
ally relevant genes. The SLC6A12 gene is a highly 
plausible candidate for influencing betaine levels. Betaine 
is an osmolyte used by cells for protection against hyper- 
osmotic environments (36), and SLC6A12 encodes a 
highly conserved osmoregulator, which controls cellu- 
lar volume by extrusion of betaine (37,38). Previous 
studies have shown that hyperosmolarity could induce
insulin resistance by impairing insulin receptor substrate
(IRS)-1 tyrosine phosphorylation and degradation of IRS1
and IRS2 in adipocytes (39,40). This connection may 
partly explain the association of betaine with insulin 
sensitivity. 

The BHMT gene encodes a cytosolic enzyme that catalyzes
the conversion of betaine and homocysteine to dimethylglycine
and methionine, respectively. The rs17823642
SNP is highly correlated with three SNPs associated with
BHMT enzyme activity and protein level (rs41272270, r^2 = 1;
rs16876512, r^2 = 0.925; and rs6875201, r^2 = 0.925) (41).

The previously described serine GWAS signal rs478093 
is in PHGDH (also known as PGDH [see 3-PGDH in Fig. 
2A]), the gene product of which catalyzes the first and rate- 
limiting step in the phosphorylation pathway of serine 
biosynthesis. 

Effect of variation at the CPS1 locus on glycine level. 

The rs715 SNP represents the same association signal as 
that identified between the SNP rs2216405 near CPS1 (r^2 =
0.47 with rs715) and glycine levels (31), but the variance in
glycine levels explained by rs715 (12.87%) in our study
compared with the variance explained by the rs2216405
SNP (8.64%) suggests that rs715 is a better marker for the 
causal variant. A recent study reported an association 
between rs715 and glycine levels specific to females, 
consistent with our results (32). 

The enzyme encoded by CPS1 catalyzes synthesis of 
carbamoyl phosphate from ammonia and bicarbonate (Fig.
2B). Patients with defects in the function or expression of
CPS1 suffer from life-threatening hyperammonemia (42). It 
is possible that variants at this locus perturb the conver- 
sion of ammonia and bicarbonate to carbamoyl phosphate. 
We hypothesize that CPS1 variants may cause excess 
ammonia, which may then lead to increased production of 
glycine and tetrahydrofolate in the glycine cleavage system 
(Figs. 2D and 3). 

ALDH1L1 as a candidate enzyme involved in glycine 
metabolism. Glycine is a key component of the folate 
pathway (Fig. 3). The protein product of ALDH1L1 cata- 
lyzes the conversion of 10-formyltetrahydrofolate, NADP, 
and water to tetrahydrofolate, NADPH, and carbon dioxide 
(Fig. 2C). Our association between a SNP near the ALDH1L1
gene and glycine-to-serine ratio implicates this enzyme in
glycine/serine conversion rate. This is in accordance with
the knowledge that the glycine cleavage system, which
accounts for ~41% of whole-body glycine flux, is tightly
linked with tetrahydrofolate in folate metabolism (43) (Figs.
2D and 3).

Sex-specific effect at CPS1 locus on glycine and 
homocysteine levels. We observed a sex-specific asso- 
ciation of CPS1 variants on glycine levels. This is consis- 
tent with the findings of Mittelstrass et al. (2011) (32). The 
variants at the CPS1 locus also have female-specific effects 
on homocysteine levels (33), and the glycine-raising allele 
is associated with raised homocysteine (44). 
Links between glycine, serine, homocysteine, and 
betaine in folate and homocysteine metabolism. 
Glycine, serine, and betaine are linked to homocysteine 
and folate metabolism (44,45) (Fig. 3). The CPS1 variant 
rs715 is strongly correlated with rs7422339 (r 2 = 0.92), 
which was previously reported to be associated with 
homocysteine and folate levels (33). House, O'Connor, and 
Guenter (44) demonstrated that the plasma concentrations 
of homocysteine, glycine, and serine were all elevated in 
folate-deficient rats. From the link between betaine and 
FIG. 2. Schematics of metabolic pathways relevant to SNP-metabolite associations. PSAT, phosphoserine aminotransferase.



homocysteine in the reaction catalyzed by BHMT, we hypothesized
that if rs17823642 is associated with betaine
level through reduced functioning of BHMT, then it would
result in elevation of not only betaine but also homocysteine
levels (46). We assessed the effect of rs17823642 on homocysteine
levels in an independent European study (Invecchiare
in Chianti, aging in the Chianti area [InCHIANTI]). We
observed a nominal association in females (n = 575; β = 0.26
[95% CI 0.07-0.44]; P value 5.9 × 10^-3) but not in males (n =
458; β = -0.02 [-0.25 to 0.21]; P value 0.87), where the
betaine-raising allele also correlated with increased homocysteine
levels. However, this association requires further
confirmation with a larger sample size.
Limitations. There are a number of limitations in our 
study. First, the triangulation approach for estimating ex- 
pected effects does not take into account the complicated 




FIG. 3. Links between glycine, serine, folate, homocysteine, and betaine in folate metabolism and homocysteine metabolism. Enzymes: (1) dihydrofolate
reductase, (2) serine hydroxymethyltransferase, (3) glycine synthase (also called glycine cleavage enzyme), (4) methylenetetrahydrofolate
reductase, (5) methionine synthase (the other name of 5-methyltetrahydrofolate-homocysteine methyltransferase), and (6) betaine-homocysteine
methyltransferase. Modified from House et al. (44) and Van Tellingen et al. (45). (A high-quality color representation of this figure is available in the
online issue.) 
feedback mechanisms and interactions involved in controlling
metabolite levels. The SNPs that we identified are
associated with several metabolites (Table 4), which means 
that they do not provide specific instruments for one me- 
tabolite. However, the associations for any one SNP are all 
with metabolites closely connected in well-annotated path- 
ways and therefore provide an instrument to test the 
relationship between alterations in those pathways and 
diabetes-related outcomes. 

Second, our estimated effects are approximate, with 
metabolite levels only measured in RISC and observed 
estimates coming from separate studies. Nevertheless, the 
size of the MAGIC study provided very good power to see 
the expected very small effects on fasting insulin levels. 

Finally, we have only been able to assess some of the 
many metabolites associated with insulin sensitivity. The 
metabolites selected in our study were those most strongly 
associated with clamp-measured insulin sensitivity in the 
RISC study (4), and we focused efforts on these metabolite 
traits. Some recent studies have reported other metabo- 
lites associated with dysglycemia (e.g., branched chain 
and aromatic amino acids) (3,5,47,48), but these metabolites
have not been measured in the RISC study.

In conclusion, our study provides novel insight into the 
genetic regulation of metabolite levels, particularly those 
involved in the glycine-related pathways closely correlated 
to insulin sensitivity. Genetic variants associated with 
metabolite levels provide an important approach to help- 
ing unravel the functional role of the metabolic pathways 
that influence diabetes-related traits. 